278 research outputs found

    Determining maximum k-width-connectivity on meshes

    Let I be an n × n binary image stored in an n × n mesh of processors with one pixel per processor. Image I is k-width-connected if, informally, between any pair of 1-pixels there exists a path of width k (composed of 1-pixels only). We consider the problem of determining the largest integer k such that I is k-width-connected, and present an optimal O(n) time algorithm for the mesh architecture.

    Yeast Features: Identifying Significant Features Shared Among Yeast Proteins for Functional Genomics

    Background
High throughput yeast functional genomics experiments are revealing associations among tens to hundreds of genes using numerous experimental conditions. To fully understand how the identified genes might be involved in the observed system, it is essential to consider the widest range of biological annotation possible. Biologists often start their search by collating the annotation provided for each protein within databases such as the Saccharomyces Genome Database, manually comparing them for similar features, and empirically assessing their significance. Such tasks can be automated, and more precise calculations of the significance can be determined using established probability measures. 
Results
We developed Yeast Features, an intuitive online tool to help establish the significance of finding a diverse set of shared features among a collection of yeast proteins. A total of 18,786 features from the Saccharomyces Genome Database are considered, including annotation based on the Gene Ontology’s molecular function, biological process and cellular compartment, as well as conserved domains, protein-protein and genetic interactions, complexes, metabolic pathways, phenotypes and publications. The significance of shared features is estimated using a hypergeometric probability, but novel options exist to improve the significance by adding background knowledge of the experimental system. For instance, increased statistical significance is achieved in gene deletion experiments because interactions with essential genes will never be observed. We further demonstrate the utility by suggesting the functional roles of the indirect targets of an aminoglycoside with a known mechanism of action, and also the targets of an herbal extract with a previously unknown mode of action. The identification of shared functional features may also be used to propose novel roles for proteins of unknown function, including a role in protein synthesis for YKL075C.
Conclusions
Yeast Features (YF) is an easy-to-use web-based application (http://software.dumontierlab.com/yeastfeatures/) that can identify and prioritize features shared among a set of yeast proteins. This approach is shown to be valuable in the analysis of complex data sets, in which the extracted associations revealed significant functional relationships among the gene products.
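
The hypergeometric estimate mentioned in the Results can be illustrated with a minimal sketch. It is not the Yeast Features implementation: the function name `hypergeom_pvalue` and the parameters `N`, `K`, `n`, `k` are hypothetical, and the numbers in the usage lines are made up for illustration only. The sketch computes the probability of seeing at least `k` proteins with a given feature in a query set of `n` proteins, when `K` of the `N` background proteins carry that feature.

```python
from math import comb


def hypergeom_pvalue(N: int, K: int, n: int, k: int) -> float:
    """Upper-tail hypergeometric probability P(X >= k).

    N: background size (hypothetical: all proteins considered)
    K: background proteins annotated with the feature
    n: size of the query set
    k: query proteins annotated with the feature
    """
    upper = min(K, n)          # X cannot exceed either K or n
    total = comb(N, n)         # number of ways to draw the query set
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible combinations contribute nothing to the sum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Illustrative numbers only (assumptions, not data from the paper).
p_all = hypergeom_pvalue(N=6000, K=100, n=20, k=5)
# Background knowledge can be encoded by restricting the background,
# for example by passing a smaller N and K that exclude proteins
# which could never be observed in the experiment.
p_restricted = hypergeom_pvalue(N=5000, K=60, n=20, k=5)
```

Treating the background size and feature count as explicit arguments is what lets background knowledge of the experimental system change the estimated significance.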

    Mass customization of teaching and learning in organizations

    In search of methods that improve the efficiency of teaching and training in organizations, several authors point out that mass customization (MC) is a principle that covers individual needs of knowledge and skills and, at the same time, limits the development costs of customized training to those of mass training. MC is proven and established in the economic sector, and shows high potential for continuing education, too. The paper explores this potential and proposes a multidisciplinary, pragmatic approach to teaching and training in organizations. The first section of the paper formulates four design principles of MC deduced from an examination of economics literature. The second section presents amit™, a frame for mass customized training, designed according to the principles presented in the first section. The evaluation results encourage the further development and use of mass customized training in continuing education, and offer suggestions for future research.

    Rahapeliriippuvuus hallintaan -menetelmäkoulutus

    This paper proposes an efficient algorithm to compress cubes during parallel data cube generation. This low-overhead compression mechanism provides block-by-block and record-by-record compression by using tuple difference coding techniques, thereby maximizing the compression ratio and minimizing the decompression penalty at run-time. The experimental results demonstrate that the typical compression ratio is about 30:1 without sacrificing running time. This paper also demonstrates that the compression method is suitable for the Hilbert space-filling curve, a mechanism widely used in multi-dimensional indexing.
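
    The abstract does not spell out its coding scheme, so the following is only a minimal sketch of one common form of tuple difference coding, under stated assumptions: each tuple is mapped to a single integer by a mixed-radix (row-major) ordering, the integers in a block are sorted, and only the first value plus the gaps between neighbours are stored. All function names and the `sizes` parameter (the cardinality of each dimension) are hypothetical. A Hilbert-curve ordering could replace the mixed-radix mapping, but that is not shown here.

```python
def tuple_to_ordinal(t, sizes):
    """Map a tuple of dimension indices to one integer (mixed radix)."""
    n = 0
    for value, size in zip(t, sizes):
        n = n * size + value
    return n


def ordinal_to_tuple(n, sizes):
    """Inverse of tuple_to_ordinal."""
    out = []
    for size in reversed(sizes):
        n, value = divmod(n, size)
        out.append(value)
    return tuple(reversed(out))


def encode_block(tuples, sizes):
    """Store the smallest ordinal, then the gaps between sorted ordinals."""
    ords = sorted(tuple_to_ordinal(t, sizes) for t in tuples)
    if not ords:
        return []
    return [ords[0]] + [b - a for a, b in zip(ords, ords[1:])]


def decode_block(deltas, sizes):
    """Rebuild the sorted tuples by accumulating the gaps."""
    out, running = [], 0
    for d in deltas:
        running += d
        out.append(ordinal_to_tuple(running, sizes))
    return out


# Illustrative block: three tuples over dimensions of size 4, 3 and 5.
sizes = (4, 3, 5)
block = [(1, 2, 3), (0, 0, 4), (1, 2, 4)]
deltas = encode_block(block, sizes)
restored = decode_block(deltas, sizes)
```

    The gaps between neighbouring tuples in a dense block are small integers, which is what makes them cheaper to store than the full tuples; the decoder only needs a running sum, which keeps decompression cheap.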

    Binding Site Prediction for Protein-Protein Interactions and Novel Motif Discovery using Re-occurring Polypeptide Sequences

    Background: While there are many methods for predicting protein-protein interaction, very few can determine the specific site of interaction on each protein. Characterization of the specific sequence regions mediating interaction (binding sites) is crucial for an understanding of cellular pathways. Experimental methods often report false binding sites due to experimental limitations, while computational methods tend to require data that are not available at the proteome scale. Here we present PIPE-Sites, a novel method of protein-specific binding site prediction based on pairs of re-occurring polypeptide sequences, which have been previously shown to accurately predict protein-protein interactions. PIPE-Sites operates at high specificity and requires only the sequences of query proteins and a database of known binary interactions with no binding site data, making it applicable to binding site prediction at the proteome scale. Results: PIPE-Sites was evaluated using a dataset of 265 yeast and 423 human interacting protein pairs with experimentally determined binding sites. We found that PIPE-Sites predictions were closer to the confirmed binding site than those of two existing binding site prediction methods based on domain-domain interactions, when applied to the same dataset. Finally, we applied PIPE-Sites to two datasets of 2,347 yeast and 14,438 human novel protein pairs predicted to interact with high confidence. An analysis of the predicted interaction sites revealed a number of protein subsequences which are highly re-occurring in binding sites and which may represent novel binding motifs. Conclusions: PIPE-Sites is an accurate method for predicting protein binding sites and is applicable at the proteome scale. Thus, PIPE-Sites could be useful for exhaustive analysis of protein binding patterns in whole proteomes as well as discovery of novel binding motifs. PIPE-Sites is available online a

    Optical clustering on a mesh-connected computer


    A distributed tree data structure for real-time OLAP on cloud architectures

    In contrast to queries for on-line transaction processing (OLTP) systems, which typically access only a small portion of a database, OLAP queries may need to aggregate large portions of a database, which often leads to performance issues. In this paper we introduce CR-OLAP, a Cloud based Real-time OLAP system based on a new distributed index structure for OLAP, the distributed PDCR tree, that utilizes a cloud infrastructure consisting of (m + 1) multi-core processors. With increasing database size, CR-OLAP dynamically increases m to maintain performance. Our distributed PDCR tree data structure supports multiple dimension hierarchies and efficient query processing on the elaborate dimension hierarchies which are so central to OLAP systems. It is particularly efficient for complex OLAP queries that need to aggregate large portions of the data warehouse, such as 'report the total sales in all stores located in California and New York during the months February-May of all years'. We evaluated CR-OLAP on the Amazon EC2 cloud, using the TPC-DS benchmark data set. The tests demonstrate that CR-OLAP scales well with an increasing number of processors, even for complex queries. For example, on an Amazon EC2 cloud instance with eight processors, for a TPC-DS OLAP query stream on a data warehouse with 80 million tuples where every OLAP query aggregates more than 50% of the database, CR-OLAP achieved a query latency of 0.3 seconds, which can be considered a real-time response.